Water-Quality Data Imputation with a High Percentage of Missing Values: A Machine Learning Approach

Abstract

The monitoring of surface-water quality, followed by water-quality modeling and analysis, is essential for generating effective strategies in surface-water-resource management. However, worldwide, and particularly in developing countries, water-quality studies are limited by the lack of a complete and reliable dataset of surface-water-quality variables. In this context, several statistical and machine-learning models were assessed for imputing water-quality data at six monitoring stations located in the Santa Lucía Chico river (Uruguay), a mixed lotic and lentic river system. The challenge of this study lies in the high percentage of missing data (between 50% and 70%) and the high temporal and spatial variability that characterizes the water-quality variables. The competing algorithms implement univariate and multivariate imputation methods: inverse distance weighting (IDW), Random Forest Regressor (RFR), Ridge (R), Bayesian Ridge (BR), AdaBoost (AB), Huber Regressor (HR), Support Vector Regressor (SVR) and K-nearest neighbors Regressor (KNNR). According to the results, more than 76% of the imputation outcomes are considered “satisfactory” (NSE > 0.45). The imputation performance is better at the monitoring stations located inside the reservoir than at those positioned along the mainstream. IDW was the model with the best imputation results, followed by RFR, HR and SVR. The approach proposed in this study is expected to aid water-resource researchers and managers in augmenting water-quality datasets and overcoming the missing-data issue, increasing the number of future studies related to water quality.
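
The abstract names inverse distance weighting (IDW) as the best-performing imputer and the Nash-Sutcliffe efficiency (NSE) as the evaluation score. The sketch below illustrates both in their textbook form. It is not the authors' implementation: the function names, the `power` exponent of 2, and the toy numbers are assumptions made for illustration only.

```python
import numpy as np


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    residual = np.sum((observed - simulated) ** 2)
    variance = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - residual / variance


def idw_impute(neighbor_values, distances, power=2.0):
    """Estimate a missing value at a target station from neighboring stations.

    neighbor_values: readings at the other stations (NaN where also missing).
    distances: strictly positive distances from the target to each neighbor.
    power: distance exponent; 2.0 is a common default, assumed here.
    """
    values = np.asarray(neighbor_values, dtype=float)
    dist = np.asarray(distances, dtype=float)
    available = ~np.isnan(values)
    if not available.any():
        return float("nan")  # no neighbor reported a value on this date
    weights = 1.0 / dist[available] ** power
    return float(np.sum(weights * values[available]) / np.sum(weights))


# Hypothetical readings at three neighboring stations; the second is missing.
estimate = idw_impute([10.0, np.nan, 30.0], distances=[1.0, 2.0, 1.0])
score = nse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
```

Under the threshold quoted in the abstract, an imputed series would count as "satisfactory" when its NSE against held-out observations exceeds 0.45.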

Similar resources

Data Quality Improvement by Imputation of Missing Values

Having missing values in a data set is very common due to various reasons including human error, misunderstanding and equipment malfunctioning. Therefore, imputation of missing values is important to improve the quality of a data set. In our previous study we presented an imputation technique called DMI, which we then found better than an existing technique called EMI in terms of a few commonly...

Imputation of Missing Data Using Machine Learning Techniques

A serious problem in mining industrial data bases is that they are often incomplete, and a significant amount of data is missing, or erroneously entered. This paper explores the use of machine-learning based alternatives to standard statistical data completion (data imputation) methods, for dealing with missing data. We have approached the data completion problem using two well-known machine le...

Missing Values Imputation Based on Iterative Learning

Databases for machine learning and data mining often have missing values. Developing an effective method for missing-value imputation is an important problem in the field of machine learning and data mining. In this paper, several methods for dealing with missing values in incomplete data are reviewed, and a new method for missing-value imputation based on iterative learning is proposed. The...

Missing Values with iterative imputation

In this paper, the author designs an efficient method for iteratively imputing missing target values with semiparametric kernel regression imputation, known as the semi-parametric iterative imputation algorithm (SIIA). While there is little prior knowledge on the datasets, the proposed iterative imputation method, which imputes each missing value several times until the algorithm converges in e...

Modified Deviation Approach to Deal with Missing Attribute values in Data Mining with Different percentage of Missing Values

In practice, an information system with missing attribute values hampers accurate estimation in data mining. If missing attribute values can be predicted in the pre-processing stage of data mining, accuracy improves and existing data mining algorithms can be applied to the completed data. In this work different types of methods available to handle incomplete in...

Journal

Journal title: Sustainability

Year: 2021

ISSN: 2071-1050

DOI: https://doi.org/10.3390/su13116318